<?php
/**
 * <https://y.st./>
 * Copyright © 2017 Alex Yst <mailto:copyright@y.st>
 * 
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 * 
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 * 
 * You should have received a copy of the GNU General Public License
 * along with this program. If not, see <https://www.gnu.org./licenses/>.
**/

$xhtml = array(
	'title' => 'Learning Journal',
	'subtitle' => 'CS 2205: Web Programming 1',
	'copyright year' => '2017',
	'body' => <<<END
<h2 id="Unit1">Unit 1</h2>
<p>
	School started back up this week.
	I can already tell I&apos;m going to love <span title="Web Programming 1">CS 2205</span>.
	The first discussion forum activity is to read articles about and write about privacy issues on the Internet.
	The first assignment is to find three websites, run them through a validator, and report on their errors.
	The first learning journal assignment is similar to the sort I like, in which we summarise the week&apos;s learning, though this time our discussion posts aren&apos;t included in the journal entry submission.
	Score on all three fronts!
	Privacy on the Internet is a huge deal for me, but it feels like most people ignore this issue.
	Not only do I get to state my case, but every other student&apos;s going to have to do research too, and maybe they&apos;ll start caring at least a little more.
	If I make a convincing argument, maybe I can help tip the scales.
	With any luck, other students will also have convincing arguments I can add to my own to help me better explain my position in the future as well.
	I knew I needed to work in something about $a[Tor] and the $a[NSA], at a minimum.
	Code validation has always been a thing of mine as well.
	I believe in writing accurate and valid code.
	I believe when Web browsers started accepting and attempting to parse invalid markup, they did the entire Web a huge disservice.
	It is for that reason that I always write in $a[XHTML], not $a[HTML], as Web browsers tend to throw errors and alert me to mistakes in $a[XHTML].
	Web browsers don&apos;t catch <strong>*all*</strong> errors in $a[XHTML], but they do catch the $a[XML] well-formedness errors that result from most of my typos.
	Whenever I&apos;m working with more-complex pages, I also always run my pages through the $a[W3C] validator, and I&apos;m a bit appalled that most pages on the Web these days can&apos;t even pass a transitional-level validation.
	Meanwhile, I always validate my own pages to the strictest standards; it&apos;s honestly not difficult at all and only a lazy hack would fail to meet even transitional-level standards.
	As for the learning journal assignment, I find I learn a lot better when I&apos;m able to write about what I learn as I go.
	The more detailed I get, the better the information sticks with me.
	Past courses here at University of the People that&apos;ve had this type of learning journal assignment have repeated the same assignment every week all term, so I&apos;m guessing I&apos;ll be using this effective (effective for me, at any rate; everyone learns differently) learning tool through the end of the course.
</p>
<p>
	The reading assignment for the week was as follows:
</p>
<ul>
	<li>
		<a href="https://computer.howstuffworks.com/internet/basics/internet-versus-world-wide-web.htm">What&apos;s the difference between the Internet and the World Wide Web? | HowStuffWorks</a>
	</li>
	<li>
		<a href="https://www.w3.org/community/webed/wiki/How_does_the_Internet_work">How does the Internet work - Web Education Community Group</a>
	</li>
	<li>
		<a href="https://www.w3.org/community/webed/wiki/The_history_of_the_Web">The history of the Web - Web Education Community Group</a>
	</li>
	<li>
		<a href="https://www.w3.org/community/webed/wiki/The_web_standards_model_-_HTML_CSS_and_JavaScript">The web standards model - HTML CSS and JavaScript - Web Education Community Group</a>
	</li>
	<li>
		<a href="https://www.w3schools.com/website/web_validate.asp">404 - Page not found</a>
	</li>
	<li>
		<a href="https://validator.w3.org/docs/why.html">Why Validate?</a>
	</li>
</ul>
<p>
	One assigned page, properly labelled above, is a <code>404</code> error.
	I&apos;ve only noted it in hopes it&apos;ll be corrected for next term&apos;s students.
</p>
<p>
	First on the reading list was an article about the differences between the Internet and the Web.
	Yes!
	This should be a mandatory study topic for all Web users.
	I happen to know someone that often likes to talk about things they have no actual knowledge about.
	Sometimes, I&apos;ll mention the Web or the Internet in a conversation with them.
	I might say something applies to the Web, realise it&apos;s not limited to the Web, and correct myself to say it applies to the Internet in general.
	Likewise, I sometimes say something about the Internet, then realise it only applies to the Web, and correct myself.
	In either case, they claim my correction is unnecessary as they&apos;re the same thing.
	I&apos;ve tried explaining that the Web runs over the Internet but isn&apos;t the Internet, and I&apos;ve tried explaining the differences between the two.
	It&apos;s not a difficult concept to grasp: the Internet is the network of interconnected computers, while the Web is one of many services (which also includes email, $a[IRC], $a[XMPP], $a[SIP], $a[VoIP], and many others) that run on the Internet.
	They still refuse to believe they&apos;re not the same thing though.
	Then again, they also refuse to believe that if you flip two coins, there&apos;s a 75% chance of getting at least one heads.
	They insist that it&apos;s a 100% chance, but that you&apos;re not guaranteed to get a heads.
	Probabilities don&apos;t work that way.
	If something&apos;s not guaranteed to happen, you don&apos;t have a 100% chance of it happening, and you don&apos;t get to just add probabilities together like that; you&apos;ve got to multiply the chance of each coin missing heads, then take the complement.
	They&apos;re certainly not someone to listen to when it comes to understanding how things actually work.
	While I think this should be mandatory study for Web users, the specific article assigned by the course is full of misinformation and shouldn&apos;t be the one studied by Web users that don&apos;t already know what they&apos;re talking about.
</p>
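The two-coin arithmetic is easy to check directly. As a quick sanity check (Python is my choice of illustration language here, nothing to do with the coursework), using exact fractions so no rounding can muddy the point:

```python
from fractions import Fraction

# Each fair coin independently misses heads with probability 1/2.
p_miss = Fraction(1, 2)

# Independent events multiply: the chance that BOTH coins miss heads.
p_no_heads = p_miss * p_miss            # 1/4

# The complement is the chance of at least one heads.
p_at_least_one_heads = 1 - p_no_heads

print(p_at_least_one_heads)             # 3/4, i.e. 75% -- not 100%
```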
<p>
	The article says that as the Internet is more easily understood than the Web, we should start with explaining the Web.
	Of course the Internet is easier to understand; the Web uses the Internet as one of its components!
	Understanding the Web without understanding the Internet is like trying to understand what a bicycle is without having any idea what a wheel is.
	You can learn about bicycles as a whole without first learning about wheels, but by the time you&apos;ve finished understanding the bicycle, you&apos;re going to have a firm grasp on the concept of wheels and won&apos;t need to discuss them separately, at least not at length.
	If you want to discuss the differences between a bicycle and a wheel, you&apos;re going to want to start with the wheel.
	The article claims though that the Web is a system used to access the Internet.
	That&apos;s not <strong>*at all*</strong> true.
	The Web uses the Internet as a transport layer for relaying Web pages, other files, and client requests for Web pages and other files, but the Web is <strong>*not*</strong> a gateway to the Internet.
	If the Web were a way of accessing the Internet, for starters, it&apos;d provide Internet access.
	However, it does not.
	Different software components are needed to provide that access, and none of those components are related to the Web.
	Internet access must first be established before Web access is even a <strong>*possibility*</strong>.
	Second of all, if the Web provided Internet access, one would be able to access all (or at least most) of the Internet through the Web.
	That&apos;s not how it works at all though.
	Only files made available over the Web can be accessed over the Web, and as many parts of the Internet have nothing to do with files, the Web has no chance of offering access to those parts except through gateway servers (meaning other services are <strong>*translated*</strong> for transfer over the Web; the Web can&apos;t be said to handle the native form of these services).
	If anything, the Internet is subdivided into several different services.
	The underlying $a[TCP]/$a[IP] (or $a[UDP]) protocol is flexible and can allow new services (and thus new pieces of the Internet) to be added on, but each existing service is only one small part of the Internet.
	No service acts as a gateway to the content of the Internet as a whole.
	Web-based gateways to services such as email or $a[IRC] can allow someone to interact with other parts of the Internet through the Web, but those other parts are strictly separate from the Web; there isn&apos;t overlap.
	These gateway services are possible only because the gateway server runs two types of services.
	One service is used to gather information, which that same server reformats and provides via the other service.
	(That said, the Web, even though it&apos;s a subdivision of the Internet, can be subdivided further.
	Some services such as CardDAV run on top of the Web.)
</p>
<p>
	The article on how the Internet works brings up a couple of interesting points.
	First, it says that $a[URI]s that use domain names as their host component act as aliases for $a[URI]s that instead use $a[IP] addresses.
	However, that&apos;s only half the story.
	$a[DNS] allows the translation of domain names into $a[IP] addresses.
	In this way, the domain names do sort of act as aliases for those $a[IP] addresses.
	Still, this is an extreme oversimplification even as far as $a[DNS] is concerned.
	However, the $a[HTTP] protocol is set up such that an <code>https:</code>- or <code>http:</code>-scheme $a[URI] that uses a domain name is <strong>*not*</strong> an alias for a similar $a[URI] that instead uses an $a[IP] address!
	First off, one domain name can point to multiple $a[IP] addresses.
	The Web server at each $a[IP] address could be serving a different website.
	One $a[URI] cannot be an alias for the $a[URI]s of multiple, completely-different pages.
	Using the $a[IP]-address-based $a[URI]s, you can choose specifically which site you want to visit, but with the domain name, you&apos;re left with the Web browser and the $a[DNS] server making that choice for you.
	I think the Web browser usually tries the first-listed $a[IP] address first, but the $a[DNS] server can be programmed to rotate through the $a[IP] addresses, changing the order, for a round robin effect.
	In practice, this case of multiple websites with the same domain isn&apos;t seen much, but it&apos;s a good example of why these aren&apos;t true aliases.
	A second example is more realistic though.
	The $a[HTTP] protocol, from version 1.1 onward, includes the host name in the request headers.
	The same server can send different websites based on whether you used the $a[URI] with the domain name or the one with the $a[IP] address!
	Back when I ran my own Web server, I set it up so that the website at the $a[IP]-address-based $a[URI] always redirected to the domain-based $a[URI], but I could&apos;ve instead served a second website there.
	Speaking of second websites, a Web server can send a different website based on <strong>*which*</strong> domain is used!
	Let&apos;s take my website for example.
	My name is Alex Yst, and my domain looks like my surname: <code>y.st.</code>.
	The $a[URI] of my homepage is <a href="https://y.st./"><code>https://y.st./</code></a>.
	The domain name resolves to <code>51.254.73.48</code> and <code>2001:41d0:c:b19:0:0:0:10</code>.
	If you try to load <a href="https://51.254.73.48/"><code>https://51.254.73.48/</code></a>, you instead see the website of my friend Opal!
	Why is that?
	Well, Opal&apos;s website, normally reached at <a href="https://wowana.me/"><code>https://wowana.me/</code></a>, is on that same server; she hosts both websites.
	<code>https://y.st./</code> cannot be an alias of <code>https://51.254.73.48/</code>, because the two $a[URI]s correspond to completely different pages!
	(Ostensibly, if you were to load the page at <a href="https://[2001:41d0:c:b19:0:0:0:10]/"><code>https://[2001:41d0:c:b19:0:0:0:10]/</code></a>, that&apos;d also load Opal&apos;s page and not mine, but I don&apos;t have $a[IPv6] service here at home to test that $a[URI] with.)
</p>
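The Host-header mechanism itself is simple enough to sketch. The Python fragment below just builds the raw request text, never touching the network; the host names are the ones from the example above:

```python
def build_request(host, path="/"):
    # Minimal HTTP/1.1 GET request.  The Host header, mandatory since
    # HTTP/1.1, is what lets one server at one IP address tell apart
    # the many websites it may be hosting.
    return ("GET {} HTTP/1.1\r\n"
            "Host: {}\r\n"
            "Connection: close\r\n"
            "\r\n").format(path, host)

# Two requests aimed at the very same IP address differ only in the
# Host header, so the server can answer each with a different site.
request_mine = build_request("y.st.")
request_opal = build_request("wowana.me")
```

Since the request line and every other header are identical, the Host line alone decides which site the server hands back.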
<p>
	Second, the article says domain names are much more human-memorable.
	This is only one benefit of using domain names instead of directly using $a[IP] addresses, but it&apos;s an important one.
	It&apos;s one of the main reasons I think the telephone number system is incredibly poorly set up.
	We should be using $a[DNS] for telephone numbers too, not just $a[IP] addresses, and the current $a[DNS] structure could handle it (in the form of TXT records, maybe with host names in the form of <code>_telephone.example.com.</code> to refer to the telephone number that <code>example.com</code> refers to), but telephone makers haven&apos;t bothered to set that up.
	Instead, the telephone numbers, which were originally like $a[IP] addresses in purpose, were completely reworked.
	Telephone numbers now refer to entries in a lookup table, like $a[DNS] names, but without the readability, without the subdomainability, and without the ability for someone outside the telephone service industry to own and/or reserve a name.
	Telephone numbers now have the disadvantages of $a[IP] addresses (they&apos;re not easily human readable/memorable) and the disadvantages of domain names (computers have to look up what the number represents instead of using it directly), with hardly any advantages of either system.
	It&apos;s a real mess, and no one seems to care but me.
</p>
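The TXT-record scheme I floated above could live in an ordinary zone file. To be clear, the <code>_telephone</code> label and the number format below are my own invention for illustration, not any deployed standard, and the names and addresses are the reserved documentation examples:

```
; Hypothetical zone fragment -- the _telephone label and the number
; format are invented for illustration only.
example.com.             IN  A    192.0.2.1
_telephone.example.com.  IN  TXT  "+1-555-0100"
```

A telephone would then resolve <code>example.com</code> to a number the same way a browser resolves it to an address, keeping the readable, subdomainable name in human hands.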
<p>
	The section on types of Web content is a bit misinformed.
	I&apos;d quote it here, but the license is incompatible with my own due to the non-commercial requirement of the original document (I archive all my <a href="https://y.st./en/coursework/">coursework</a> on my website, released under the $a[GNU] {$a['GPLv3+']}, which allows commercial reuse).
	Basically though, it says Web files fit into four groups: text files, Web markup/script files ($a[XHTML]/$a[CSS]/JavaScript), server-side scripts, and files that require a browser plugin or non-browser program to read/run.
	So ... where do images fit in?
	Images are often used on the Web, but fit into none of those four groups.
	Furthermore, server-side scripts overlap with all other groups; they&apos;re not a group themselves.
	A server-side script needs to generate a file that&apos;ll be sent to the client, and that file must fit into another group (once we fix the issue of not all Web files being grouped at all).
	Server-side scripting is powerful though, and allows different content to be served under differing circumstances, so I can see why the authors of the wiki article would group it into its own category.
	I&apos;d say there are two groups of files on the Web.
	First, there are the standards-based language files mentioned, such as $a[XHTML] files, $a[CSS] files, and JavaScript files.
	And second, there are files that are simply taken as they are.
	Plain text files aren&apos;t special, and are just part of this second group, which also includes image files and such.
	Files that require a plugin or other application are just things the Web browser isn&apos;t programmed to handle; which files the Web browser handles varies between browsers.
	For most browsers, $a[PDF] files need to be downloaded and read with a separate application.
	However, Firefox allows these to be displayed in-browser.
	(When JavaScript is disabled, this feature breaks silently; not even an error message is displayed.)
</p>
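The two-group idea can be put as a toy classifier. The media-type sets here are hypothetical examples of my own choosing; a real browser consults its own, far longer, capability and plugin registries, which is exactly why the third outcome varies between browsers:

```python
# Hypothetical sets for illustration; a real browser's lists are far
# longer and differ from browser to browser (Firefox, for instance,
# displays PDF in-browser while many others hand it off).
STANDARDS_LANGUAGES = {"application/xhtml+xml", "text/html",
                       "text/css", "text/javascript"}
TAKEN_AS_IS = {"text/plain", "image/png", "image/jpeg", "image/svg+xml"}

def classify(content_type):
    """Place a Web file into one of the two groups described above."""
    if content_type in STANDARDS_LANGUAGES:
        return "standards-based language file"
    if content_type in TAKEN_AS_IS:
        return "file taken as it is"
    # Anything else is merely a type this particular browser doesn't
    # handle itself, not a third kind of Web file.
    return "handed to a plugin or external application"
```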
<p>
	I have a friend that uses Gopher for their own server instead of $a[HTTP].
	I&apos;ve looked into the Gopher protocol myself, but it doesn&apos;t fit my needs.
	Specifically, it lacks a way to have the client interpret a file as $a[XHTML].
	$a[HTML] can be used, but without $a[XHTML], my files don&apos;t render properly.
	I think $a[HTML] is messy, so I&apos;m not going to switch to it any time soon.
	My pages don&apos;t render correctly when interpreted as $a[HTML] either.
	They&apos;re perfectly valid $a[XHTML] and $a[XML], but they make use of some of $a[XML]&apos;s cleanliness options that basic $a[HTML] simply doesn&apos;t offer.
	I was unaware of the licensing issues, as mentioned in the history article, that caused the shift away from Gopher though.
	I thought the creation of $a[HTTP] was just because a more-flexible protocol was wanted.
	$a[XHTML] didn&apos;t exist at that time, so I knew $a[XHTML] support wasn&apos;t one of the issues in question, but the Gopher protocol does have restrictive limitations in what it can accomplish.
	The existence of the Mosaic $a[HTTP]/Gopher hybrid client was also new information to me.
</p>
<p>
	The browser wars are touched upon, but their severe effect on $a[HTML] isn&apos;t mentioned.
	Because of this war, browsers started attempting to parse and render pages with invalid markup.
	This resulted in lazy Web authors building poorly-coded pages that were malformed.
	To avoid breaking the Web, Web browser vendors to this day allow their browsers to interpret malformed $a[HTML] files, resulting in $a[HTML] being an utter mess.
	The laziness of Web authors still hasn't ended, and it's not going to any time soon.
	The $a[WHATWG] set up a standard for rendering malformed pages &quot;correctly&quot;, so there's no incentive for Web developers to take the proper care they should.
	Instead, Web browser vendors are stuck pouring effort into writing and maintaining code for the sole benefit of lazy developers that don't deserve the help.
	Clean code is something to care about and strive for.
	It is for this reason that I always use $a[XHTML] instead of $a[HTML].
	Web browsers still render pages with tags being nested within tags they shouldn&apos;t be, and they allow invalid tags, but at least my pages are forced to meet the basic $a[XML] well-formedness rules before they&apos;ll render, which catches most of my mistakes before I publish live.
	Whenever I publish a page using tags outside the basic set, I also use a validator to ensure the rest of the markup is fine as well.
</p>
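The kind of mistake well-formedness checking catches is easy to demonstrate. Python&apos;s bundled $a[XML] parser refuses mis-nested tags outright, just as an $a[XHTML]-aware browser does, instead of silently &quot;fixing&quot; them the way an $a[HTML] parser must:

```python
import xml.etree.ElementTree as ET

well_formed = "<p>This <strong>nests</strong> correctly.</p>"
mis_nested = "<p>This <strong>does not.</p></strong>"

# The well-formed fragment parses without complaint.
ET.fromstring(well_formed)

# The mis-nested fragment is rejected outright rather than silently
# repaired, which is exactly what surfaces typos before publication.
try:
    ET.fromstring(mis_nested)
    rejected = False
except ET.ParseError:
    rejected = True
```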
<p>
	The interference of the $a[WHATWG] was covered too.
	I&apos;m still a bit peeved with them, myself.
	$a[XHTML]2 was set to be so much cleaner and feature-rich than $a[HTML], but a group of browser vendors that didn&apos;t want to write the code needed to support $a[XHTML]2 banded together to fight against it.
	They&apos;re the reason the Web moved toward $a[HTML]5 instead of $a[XHTML]2, even though $a[HTML] is messy and should&apos;ve been deprecated.
	Both for killing $a[XHTML]2 and for reviving $a[HTML], I think these people moved us in the wrong direction, but their treachery doesn&apos;t stop there.
	They also insist that $a[HTML]5 is a &quot;living standard&quot;, which is basically a rolling release of a specification, without even version numbers to distinguish between different past and present versions of the specification.
	In other words, it&apos;s not even a standard at all!
	It&apos;s a moving target that no one can ever hope to keep up with.
	The $a[W3C] periodically publishes static snapshots of the $a[WHATWG] specification though, so as long as we use the $a[W3C] version of the standard (which the stupid $a[WHATWG] discourages doing), we have a target we can actually hit.
	Currently, the $a[HTML] 5.1 specification is the latest available, but $a[HTML] 5.2 is in the works.
	Additionally and thankfully, the $a[WHATWG] also didn&apos;t fully kill $a[XHTML].
	An alternate, $a[XML]-based syntax for $a[HTML]5 is available, known as $a[XHTML]5.
	I use $a[XHTML] 5.1 for all pages I write, though if $a[XHTML] had been discontinued in its entirety, I would continue using $a[XHTML] 1.1 just to be able to continue using a cleaner markup language than $a[HTML] has become.
	Supposedly, the $a[WHATWG] were trying to preserve backwards compatibility, but this makes zero sense when you think about it.
	<code>&lt;!DOCTYPE&gt;</code> declarations exist for a reason: communicating to the client what version of the language is being used so the document can be correctly rendered.
	Despite the huge differences between $a[XHTML] 1.1 and $a[XHTML]2, the same client can easily distinguish which language version is used and display documents written in both versions completely correctly.
	Additionally, as the $a[WHATWG] mangled the <code>&lt;!DOCTYPE&gt;</code> declaration of $a[HTML]5 to the point that it contains no useful information, they&apos;ve <strong>*broken*</strong> the possibility of compatibility in the future.
	If their &quot;living standard&quot; changes too much, it&apos;ll prevent either new or old documents from being displayed correctly, as there&apos;s no way to distinguish between the two.
	Finally, it&apos;s important to note that when clients are required to fix and render invalid markup, as is required in $a[HTML], it consumes more system resources than are required for rendering markup that can be thrown out in case of malformation.
	The continuation of $a[HTML] is bad news for devices with limited resources, such as mobile devices!
</p>
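The contrast in the <code>&lt;!DOCTYPE&gt;</code> declarations is plain when they sit side by side; both lines below are the published declarations for their respective languages:

```
<!-- XHTML 1.1: the DOCTYPE names the exact language version and DTD. -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
    "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<!-- HTML5: no version information survives at all. -->
<!DOCTYPE html>
```

Everything a client could have used to tell one future revision of the &quot;living standard&quot; from another has been stripped away.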
END
);
